20 research outputs found
Start Small: Training Game Level Generators from Nothing by Learning at Multiple Sizes
A procedural level generator is a tool that generates levels from noise. One
approach to building generators is machine learning but, given the rarity of
training data, multiple methods have been proposed to train generators from
nothing. However, level generation tasks tend to have sparse feedback, which is
commonly mitigated using game-specific supplemental rewards. This paper
proposes a novel approach to train generators from nothing by learning at
multiple level sizes starting from a small size up to the desired sizes. This
approach employs the observed phenomenon that feedback is denser at smaller
sizes to avoid supplemental rewards. It also presents the benefit of training
generators to output levels at various sizes. We apply this approach to train
controllable generators using generative flow networks. We also modify
diversity sampling to be compatible with generative flow networks and to expand
the expressive range. The results show that our methods can generate
high-quality diverse levels for Sokoban, Zelda and Danger Dave for a variety of
sizes, after only 3h 29min up to 6h 11min (depending on the game) of training
on a single commodity machine. Also, the results show that our generators can
output levels for sizes that were unavailable during training.
Comment: 26 pages, 7 tables, 7 figures. Code:
https://github.com/yahiaetman/ms-level-ge
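The core idea of the paper above, learning at a small level size first and growing toward the target size, can be sketched as a simple curriculum loop. All names and the size schedule below are hypothetical; the paper trains controllable GFlowNet generators, which this sketch stubs out:

```python
def train_at_size(generator_params, size, steps):
    # Placeholder for one round of generator training at a fixed level
    # size; in the paper this would be a GFlowNet update. Here we only
    # record the visited sizes and update count (hypothetical state).
    generator_params["updates"] += steps
    generator_params["sizes_seen"].append(size)
    return generator_params

def multi_size_curriculum(start=(3, 3), target=(7, 7), steps_per_size=10):
    # Train at progressively larger sizes, starting small (where the
    # paper observes feedback is denser) up to the desired size.
    params = {"updates": 0, "sizes_seen": []}
    h, w = start
    while True:
        params = train_at_size(params, (h, w), steps_per_size)
        if (h, w) == target:
            return params
        h, w = min(h + 1, target[0]), min(w + 1, target[1])
```

A generator trained this way sees every intermediate size, which is consistent with the paper's observation that the trained generators can also emit levels at sizes not explicitly targeted.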
Literature review of procedural content generation in puzzle games
This is the third chapter from my Master Thesis (Automatic Game Generation). This
chapter provides a review of past work on Procedural Content Generation,
highlighting different efforts towards generating levels and rules for games. These
efforts are grouped according to their similarity and sorted chronologically within
each group.
Context based clearing procedure: A niching method for genetic algorithms
In this paper we present CBC (context based clearing), a procedure for solving the niching problem. CBC is a clearing technique governed by the amount of heterogeneity in a subpopulation, as measured by the standard deviation. CBC was tested using the M7 function, a massively multimodal deceptive optimization function typically used for testing the efficiency of finding global optima in a search space. The results are compared with a standard clearing procedure and show that CBC reaches global optima several generations earlier than the standard clearing procedure. In this work the target was to test the effectiveness of context information in controlling clearing. A subpopulation includes a fixed number of candidates rather than a fixed radius. Each subpopulation is then cleared either totally or partially according to the heterogeneity of its candidates, which automatically regulates the radius of the area cleared around the pivot of the subpopulation.
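The clearing rule described above can be sketched as follows. The parameter names, the 1-D fitness representation, and the exact threshold semantics are assumptions for illustration, not the paper's implementation:

```python
import statistics

def context_based_clearing(population, fitness, k=4, hetero_threshold=0.5):
    # Sketch of CBC: partition the fitness-sorted population into
    # subpopulations of k candidates each (fixed count, not fixed radius),
    # then clear (zero the fitness of) all but the best candidate of a
    # subpopulation only when it is homogeneous, i.e. the standard
    # deviation of its fitness values falls below hetero_threshold.
    order = sorted(range(len(population)), key=lambda i: fitness[i], reverse=True)
    cleared = list(fitness)
    for start in range(0, len(order), k):
        group = order[start:start + k]
        if len(group) < 2:
            continue
        spread = statistics.pstdev(fitness[i] for i in group)
        if spread < hetero_threshold:
            # Homogeneous subpopulation: keep only the pivot (best member).
            for i in group[1:]:
                cleared[i] = 0.0
    return cleared
```

A heterogeneous subpopulation (large spread) is left partially or fully uncleared, which matches the abstract's claim that context information controls how much is cleared around each pivot.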
Automatic puzzle level generation : a general approach using a description language
In this paper, we present a general technique to generate and evaluate puzzle levels made with Puzzle Script, a videogame description language created by Stephen Lavelle for scripting puzzle games. We propose a system to help in generating levels for Puzzle Script without any restriction on the current game rules. Two different approaches are used, with a trade-off between speed (constructive approach) and playability (genetic approach). Both approaches use a level evaluator that calculates the scores of the generated levels based on their playability and challenge. The generated levels were assessed statistically by human players, and the results show that the constructive approach can generate playable levels up to 90% of the time, while the genetic approach can reach up to 100%. The results also show a high correlation between the system scores and the human scores.
Enhanced bag of words using multilevel k-means for human activity recognition
This paper aims to enhance the bag of features in order to improve the accuracy of human activity recognition. The human activity recognition process consists of four stages: local space-time feature detection, feature description, bag-of-features representation, and SVM classification. The k-means step in the bag of features is enhanced by applying three levels of clustering: clustering per video, clustering per action class, and clustering for the final codebook. The experimental results show that the proposed enhancement reduces the time and memory requirements and enables the use of all training data in the k-means clustering algorithm. The accuracy of action classification was evaluated on two popular datasets (KTH and Weizmann). In addition, the proposed method improves the human activity recognition accuracy by 5.57% on the KTH dataset using the same detector, descriptor, and classifier.
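The multilevel idea above, clustering in stages so the final k-means never has to process all raw features at once, can be sketched with a toy 1-D k-means. For brevity this sketch shows two of the three levels (per video, then final codebook); the per-class level would sit between them. All sizes and names are hypothetical:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    # Minimal 1-D k-means (illustrative only, not the paper's code).
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: abs(p - centers[c]))
            buckets[j].append(p)
        centers = [sum(b) / len(b) if b else centers[j]
                   for j, b in enumerate(buckets)]
    return centers

def multilevel_codebook(features_per_video, words_per_video=2, final_words=3):
    # Level 1: cluster each video's features separately.
    intermediate = []
    for feats in features_per_video:
        intermediate.extend(kmeans(feats, words_per_video))
    # Final level: cluster the per-video centers into the codebook, so
    # only the (much smaller) set of centers is clustered globally.
    return sorted(kmeans(intermediate, final_words))
```

Because the final clustering only ever sees per-video (or per-class) centers, all training data can contribute without the memory cost of one global k-means, which is the reduction the abstract reports.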
Unsupervised domain adaptation with post-adaptation labeled domain performance preservation
Unsupervised domain adaptation is a machine learning application that aims to transfer knowledge learned from a seen (source) domain with labeled data to an unseen (target) domain with only unlabeled data. Recently developed techniques apply adversarial learning to learn domain-transferable features. However, current adversarial domain adaptation models suffer from the training instability of adversarial networks. Furthermore, it is unclear what performance cost the source domain incurs while the domain-transferable representation is learned. To address this issue, we propose a novel approach termed Unsupervised Domain Adaptation with Source Preservation (UDA-SP). It shares the same objective of obtaining a generalized representation across different distributions as existing domain adaptation techniques, with the additional objective of preserving efficient performance in the source domain. This is accomplished by learning representations of shared and source-specific features in two distinct networks. These representations are then concatenated with available class information to train a new classifier that can exploit both shared and domain-specific features. We conducted a comprehensive experimental analysis on three benchmark text datasets. Experiments validate that our proposed method outperforms competing state-of-the-art methods. Further experiments demonstrate that UDA-SP has a good ability to generalize learned knowledge to unseen domains while maintaining seen-domain performance.
An Optimized Dual Classification System for Arabic Extractive Generic Text Summarization
Summarization is the process of producing a shorter presentation of the most important information from one or multiple sources according to particular needs. With summaries, we can make effective decisions and obtain useful information in less time. This paper introduces an Arabic extractive text summarization system that integrates Bayesian and Genetic Programming (GP) classification methods in an optimized way to extract the summary sentences. The system is trainable and uses a manually labeled corpus. Features for each sentence are extracted based on Arabic morphological analysis and part-of-speech tags, in addition to simple position and counting methods. The initial set of features is examined and reduced to an optimized and discriminative subset. Given human-generated summaries, the system is evaluated in terms of recall, precision, and F-measure.
Generic Symbolic Music Labeling Pipeline
The availability of large datasets is a key factor for machine learning success. However, while many symbolic music files are available, labeled symbolic music datasets are scarce in many applications. In this paper, we propose a general pipeline for symbolic music labeling. The input to the pipeline is unlabeled MIDI files without particular constraints. First, the pipeline filters the input and splits it into time-limited musical segments. Second, the pipeline generates intermediate labels using multiple pre-trained models, neural networks, and heuristics. Finally, multiple methods are used to combine the intermediate labels into final labels, where a label is accepted only if it exceeds a certain confidence level. To test the pipeline, we apply it to label a new piano difficulty dataset, "PianoDiff", and provide a thorough analysis to facilitate its usage in piano difficulty estimation for classification and generation using machine learning approaches. We also test our pipeline against a dataset with manual labels: a random forest model trained on the weakly labeled dataset achieves an F1-score improvement of 13 percentage points compared to the same model trained on a smaller manually labeled dataset.
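The final combination step described above, merging several intermediate labels and accepting the result only above a confidence level, can be sketched as a weighted vote. The (label, confidence) input format, the normalization, and the threshold semantics are assumptions for illustration; the paper combines intermediate labels with multiple methods:

```python
def combine_labels(intermediate, threshold=0.7):
    # Sketch of one combination method: each intermediate labeler votes
    # with a (label, confidence) pair; confidences are summed per label
    # and the winning label is accepted only if its normalized weight
    # exceeds the confidence threshold, otherwise the segment is rejected.
    weights = {}
    for label, conf in intermediate:
        weights[label] = weights.get(label, 0.0) + conf
    total = sum(weights.values())
    best = max(weights, key=weights.get)
    score = weights[best] / total
    return (best, score) if score >= threshold else (None, score)
```

Rejecting low-confidence segments trades dataset size for label quality, which is what makes the resulting weak labels usable for downstream training.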